A quantitative form of the Nullity Theorem is presented, which establishes a linear relation
between the singular values of the two submatrices involved in the theorem up to the first order. 
The theorem is then extended to function spaces and a corresponding form in infinite dimension is 
discussed.

The Nullity Theorem is concerned with submatrices of a block matrix $T$ and its inverse. The theorem says that
complementary submatrices have the same nullity - a basic result in linear algebra that deserves to be better known.
This paper responds to a problem posed by Gilbert Strang: to find estimates of singular values when they are nearly
but not exactly zero. Roughly speaking, if a submatrix of $T$ almost has nullity $k$ (meaning that its $(k+1)$st smallest
singular value is an infinitesimal), then the complementary submatrix of $T^{-1}$ almost has this nullity.

At the end we consider extensions to operators on function spaces. The Nullity Theorem does not explicitly involve 
the orders of the submatrices (unless it is stated as an equality for their ranks). It extends to infinite dimensions and 
we look for a corresponding quantitative form. 

Banded matrices, especially tridiagonal ones, arise in many contexts in applied mathematics and physics.
They are of interest partly because the number of nonzero entries of a banded matrix is linear in its size, which makes
them much cheaper to manipulate than full matrices. However, the inverse of a banded matrix is in general not banded,
so it takes more work to reduce the complexity of computations involving the inverse. Fortunately, the
classical theorem that we present in the next section asserts that the off-diagonal submatrices of the inverse
matrix have low rank. This makes our calculation much easier: if an n x n matrix A is of rank k, then
A = BC for some n x k matrix B and some k x n matrix C.
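As a small illustration of why such a factorization helps, the sketch below applies a rank-k matrix A = BC to a vector as B(Cx), which needs about 2nk multiplications instead of n^2. The helper names `matvec` and `low_rank_matvec` and the sample factors are illustrative assumptions, not taken from the paper.

```python
def matvec(m, x):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def low_rank_matvec(B, C, x):
    """Apply A = B C to x without forming A: first C x (k values), then B (C x)."""
    return matvec(B, matvec(C, x))

# Hypothetical rank-1 example with n = 4, k = 1.
B = [[1], [2], [3], [4]]          # n x k
C = [[1, 0, -1, 2]]               # k x n
x = [1, 1, 1, 1]

# Forming A explicitly costs n^2 entries; it is built here only for comparison.
A = [[sum(B[i][l] * C[l][j] for l in range(len(C))) for j in range(4)] for i in range(4)]

print(low_rank_matvec(B, C, x))   # [2, 4, 6, 8]
print(matvec(A, x))               # [2, 4, 6, 8]
```

The two results agree; only the operation count differs, and the saving grows with n when k stays fixed.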

The matrices that we encounter in practice may not be banded, but their entries outside a band are often small. Thus
we may still want to use banded matrices to simplify the computation. For example, in solid state physics, the
tight-binding model produces a tridiagonal Hamiltonian, but taking into account the "long-range" interactions yields
an approximately tridiagonal Hamiltonian [1]. If we focus on the off-diagonal blocks of the inverse matrix, this process
can be interpreted as the approximation of the original blocks by low rank matrices.
In this paper, we will study the error that arises from this approximation.

2. THE NULLITY THEOREM

The nullity theorem has been proved in many articles, including [2] and [3]. We will state this theorem in a notation
convenient for us. The nullity of a matrix B means the dimension of its kernel.

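Theorem 2.1. If
$$\begin{pmatrix} A & B \\ G & D \end{pmatrix}^{-1} = \begin{pmatrix} E & C \\ F & H \end{pmatrix},$$
then $\dim\ker B = \dim\ker C$. In other words, $\operatorname{nullity}(B) = \operatorname{nullity}(C)$.

We say that an $M \times M$ matrix $K = (k_{ij})$ is a banded matrix of bandwidth $p$ if $k_{ij} = 0$ whenever $|i - j| > p$. In
this case, write
$$K = \begin{pmatrix} A & B \\ G & D \end{pmatrix} \quad\text{and}\quad K^{-1} = \begin{pmatrix} E & C \\ F & H \end{pmatrix},$$
where $A$ is $n \times (n+p)$ and $C$ is $(n+p) \times (M-n)$. Since the bandwidth of $K$ is $p$, $B = 0$. Hence $\dim\ker C = \dim\ker B = M - n - p$, so
$\operatorname{rank} C = p$, which is independent of the size of $K$.

Corollary 2.2. If $K$ is a banded matrix of bandwidth $p$, and $C$ is a submatrix of $K^{-1}$ above the $p$th subdiagonal,
then $\operatorname{rank} C \le p$.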

This is the precise statement of the low off-diagonal rank property of $K^{-1}$. We will use the notation of this
corollary throughout. By symmetry, a similar result holds for the lower off-diagonal submatrices of $K^{-1}$.
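The corollary can be checked exactly on small examples. The following minimal sketch uses rational arithmetic from the Python standard library to invert a banded matrix and compute the rank of the block C (rows 1 to n+p, columns n+1 to M of the inverse) for every admissible n. The test matrix produced by `banded` and all function names are hypothetical choices made for this illustration; any invertible banded matrix would do.

```python
from fractions import Fraction

def inverse(mat):
    """Exact Gauss-Jordan inverse; assumes mat is square and invertible."""
    m = len(mat)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(m)]
           for i, row in enumerate(mat)]
    for c in range(m):
        piv = next(r for r in range(c, m) if aug[r][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [x / d for x in aug[c]]
        for r in range(m):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[m:] for row in aug]

def rank(mat):
    """Exact rank by forward elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in mat]
    rk = 0
    for c in range(len(rows[0])):
        piv = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][c] / rows[rk][c]
            rows[r] = [x - f * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk

def banded(M, p):
    """Hypothetical test matrix: strictly diagonally dominant, bandwidth p."""
    return [[(2 * p + 1 if i == j else -1) if abs(i - j) <= p else 0
             for j in range(M)] for i in range(M)]

def off_diagonal_ranks(M, p):
    """Rank of C = K^{-1}[0:n+p, n:M] for n = 1, ..., M-p-1."""
    Kinv = inverse(banded(M, p))
    return [rank([row[n:] for row in Kinv[:n + p]]) for n in range(1, M - p)]

print(off_diagonal_ranks(6, 1))   # tridiagonal case
print(off_diagonal_ranks(8, 2))   # bandwidth 2
```

Because the arithmetic is exact, the computed ranks are not affected by rounding, which matters here since a numerically tiny singular value would otherwise be hard to distinguish from zero.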
3. THE MAIN THEOREM



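Now we consider the case in which $\|B\|$ is small but nonzero. We want to replace $C$ by some matrix $L$ of rank $p$.
The first attempt is quite intuitive. We treat the nonzero block $B$ as a perturbation and look at the unperturbed
matrix $K_0 = \begin{pmatrix} A & 0 \\ G & D \end{pmatrix}$ and the upper right block of its inverse, $C_0$, which is our first candidate for approximating
$C$. Let $\Delta K = \begin{pmatrix} 0 & B \\ 0 & 0 \end{pmatrix}$ and $\epsilon = \|\Delta K\|$. When $\epsilon < 1/\|K_0^{-1}\|$, we have

$$K^{-1} = (K_0 + \Delta K)^{-1} = K_0^{-1} \sum_{n=0}^{\infty} (-1)^n \left(\Delta K\, K_0^{-1}\right)^n. \tag{3.1}$$

Therefore,

$$\|C - C_0\| \le \|K^{-1} - K_0^{-1}\| \le \sum_{n=1}^{\infty} \|K_0^{-1}\|^{n+1} \epsilon^n = \frac{\|K_0^{-1}\|^2 \epsilon}{1 - \|K_0^{-1}\| \epsilon}.$$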
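The geometric-series bound on $\|C - C_0\|$ can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the Frobenius norm in place of the norm in the text (the series argument only needs a submultiplicative norm, and the norm of a block does not exceed the norm of the whole matrix), a hypothetical tridiagonal $K_0$, and a block $B$ filled with a constant `delta`. All function names are made up for the example.

```python
import math

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (floating point)."""
    m = len(a)
    aug = [[float(x) for x in row] + [float(i == j) for j in range(m)]
           for i, row in enumerate(a)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [x / d for x in aug[c]]
        for r in range(m):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[m:] for row in aug]

def fro(a):
    """Frobenius norm (assumption: used here instead of the paper's norm)."""
    return math.sqrt(sum(x * x for row in a for x in row))

def block_C(kinv, n, p):
    """Upper right (n+p) x (M-n) block of an inverse."""
    return [row[n:] for row in kinv[:n + p]]

def neumann_check(M=6, n=2, p=1, delta=0.01):
    # Unperturbed banded matrix K0; its block B (rows < n, columns >= n+p) is zero.
    K0 = [[3.0 if i == j else (-1.0 if abs(i - j) <= p else 0.0)
           for j in range(M)] for i in range(M)]
    K = [row[:] for row in K0]
    for i in range(n):                       # fill B with small entries
        for j in range(n + p, M):
            K[i][j] = delta
    eps = delta * math.sqrt(n * (M - n - p))  # Frobenius norm of Delta K
    K0inv, Kinv = inverse(K0), inverse(K)
    C0, C = block_C(K0inv, n, p), block_C(Kinv, n, p)
    err = fro([[c - c0 for c, c0 in zip(rc, rc0)] for rc, rc0 in zip(C, C0)])
    norm0 = fro(K0inv)
    bound = norm0 ** 2 * eps / (1.0 - norm0 * eps)
    return err, bound, norm0 * eps

err, bound, contraction = neumann_check()
print(err, bound, contraction)   # expect err <= bound and contraction < 1
```

The third returned value is $\|K_0^{-1}\|\epsilon$; the bound is only meaningful when it is below 1, which is the convergence condition of the series.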

The bound on $\|C - C_0\|$ is a nice estimate in the sense that it controls the error by a function of $\epsilon$ no matter what the perturbation
$B$ is. However, $C_0$ is not necessarily the best approximation. Recall that [3]

$$\sigma_k(C) = \inf\{\|C - L\| : \operatorname{rank} L \le k - 1\}. \tag{3.2}$$

Hence, the minimum error that arises from replacing $C$ by a rank-$p$ matrix is $\sigma_{p+1}(C)$, and the SVD also determines
the $L$ that minimizes the error. We are concerned with the upper bound of $\sigma_{p+1}(C)$ when $B$ varies. Let $L$ denote the
best approximation of rank $p$ to $C$ and $J = \begin{pmatrix} E & L \\ F & H \end{pmatrix}$. Then we have the following

Lemma 3.3. If $\epsilon = \|B\|$ is sufficiently small, then $J$ is invertible, and $J^{-1} = \begin{pmatrix} \tilde A & 0 \\ \tilde G & \tilde D \end{pmatrix}$ for some $\tilde A, \tilde G, \tilde D$. Furthermore,
$B = A(L - C)\tilde D$ and $A - \tilde A = A(L - C)\tilde G$.
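Proof. $J$ is invertible because

$$\|J - K^{-1}\| = \|L - C\| = \sigma_{p+1}(C) \le \frac{\|K_0^{-1}\|^2 \epsilon}{1 - \|K_0^{-1}\| \epsilon},$$

which is less than $1/\|K\|$ if $\epsilon$ is sufficiently small. Here we have used our first estimate. Let

$$J^{-1} = \begin{pmatrix} \tilde A & \tilde B \\ \tilde G & \tilde D \end{pmatrix}.$$

Then $\tilde B = 0$ by the nullity theorem.

To prove the second statement, we notice that

$$K = (KJ)J^{-1} = \begin{pmatrix} AE + BF & AL + BH \\ GE + DF & GL + DH \end{pmatrix} \begin{pmatrix} \tilde A & 0 \\ \tilde G & \tilde D \end{pmatrix}.$$

However,

$$KK^{-1} = \begin{pmatrix} AE + BF & AC + BH \\ GE + DF & GC + DH \end{pmatrix} = I,$$

so $AE + BF = I$ and $AL + BH = A(L - C)$, since $AC + BH = 0$. Hence,

$$\begin{pmatrix} A & B \\ G & D \end{pmatrix} = \begin{pmatrix} I & A(L - C) \\ * & * \end{pmatrix} \begin{pmatrix} \tilde A & 0 \\ \tilde G & \tilde D \end{pmatrix} = \begin{pmatrix} \tilde A + A(L - C)\tilde G & A(L - C)\tilde D \\ * & * \end{pmatrix}.$$

Therefore, $B = A(L - C)\tilde D$ and $A - \tilde A = A(L - C)\tilde G$. □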

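Now we can state and prove our main theorem.
Theorem 3.4. For sufficiently small $\epsilon$,

$$\sup_{\|B\| \le \epsilon} \frac{\sigma_{p+1}(C)}{\|B\|} = \frac{1}{\sigma_n(A)\,\sigma_{M-n-p}(D)} + O(\epsilon).$$

Proof. We first show that

$$\frac{\sigma_n(A)\,\sigma_{M-n-p}(\tilde D)\,\sigma_{p+1}(C)}{1 + \sigma_{p+1}(C)\|\tilde G L\|/\sigma_p(C)} \le \|B\|.$$

The best approximation $L$ is given by the SVD as follows: if

$$C = Q \begin{pmatrix} \sigma_1(C) & & \\ & \ddots & \\ & & \sigma_k(C) \end{pmatrix} P^{-1}, \tag{3.5}$$

where $Q$ and $P$ are orthogonal matrices and $k = \min\{n+p, M-n\}$ (the middle factor is $(n+p) \times (M-n)$, with zeros off the diagonal), then

$$L = Q \begin{pmatrix} \sigma_1(C) & & & \\ & \ddots & & \\ & & \sigma_p(C) & \\ & & & 0 \end{pmatrix} P^{-1}. \tag{3.6}$$

By the lemma, we have

$$B = A(L - C)\tilde D = AQ \begin{pmatrix} 0 & & & \\ & -\sigma_{p+1}(C) & & \\ & & \ddots & \\ & & & -\sigma_k(C) \end{pmatrix} P^{-1}\tilde D. \tag{3.7}$$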



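Since $L\tilde D = 0$ by construction, $P^{-1}\tilde D$ looks like $\begin{pmatrix} 0 \\ D_0 \end{pmatrix}$, where $D_0$ is an $(M-n-p) \times (M-n-p)$ matrix and
$\sigma_{M-n-p}(D_0) = \sigma_{M-n-p}(\tilde D)$. Let $AQ = (A_1 \;\; A_2)$, where $A_1$ has $p$ columns and $A_2$ is an $n \times n$ matrix. By the
lemma, $AL = A(L - C)\tilde G L + \tilde A L = A(L - C)\tilde G L$. We notice that

$$AL = AQ \begin{pmatrix} \sigma_1(C) & & & \\ & \ddots & & \\ & & \sigma_p(C) & \\ & & & 0 \end{pmatrix} P^{-1} = A_1 \begin{pmatrix} \begin{matrix} \sigma_1(C) & & \\ & \ddots & \\ & & \sigma_p(C) \end{matrix} & 0 \end{pmatrix} P^{-1}.$$

Hence,

$$\|A_1\|\,\sigma_p(C) \le \|AL\| \le \|A(L - C)\|\,\|\tilde G L\|. \tag{3.8}$$

On the other hand,

$$A(L - C) = A_2 \begin{pmatrix} 0 & \begin{matrix} -\sigma_{p+1}(C) & & \\ & \ddots & \\ & & -\sigma_k(C) \end{matrix} \end{pmatrix} P^{-1}.$$

Therefore,

$$\|A(L - C)\| \ge \sigma_n(A_2)\,\sigma_{p+1}(C) \ge \left(\sigma_n(A) - \|A_1\|\right)\sigma_{p+1}(C). \tag{3.9}$$

We combine (3.8) and (3.9) to obtain

$$\sigma_n(A) - \frac{\|A(L - C)\|}{\sigma_{p+1}(C)} \le \|A_1\| \le \frac{\|A(L - C)\|\,\|\tilde G L\|}{\sigma_p(C)}.$$

Hence,

$$\|A(L - C)\| \ge \frac{\sigma_n(A)\,\sigma_{p+1}(C)}{1 + \sigma_{p+1}(C)\|\tilde G L\|/\sigma_p(C)}.$$

Now we put all things together:

$$\|B\| = \|A(L - C)\tilde D\| \ge \|A(L - C)\|\,\sigma_{M-n-p}(D_0) \ge \frac{\sigma_n(A)\,\sigma_{p+1}(C)\,\sigma_{M-n-p}(\tilde D)}{1 + \sigma_{p+1}(C)\|\tilde G L\|/\sigma_p(C)}.$$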
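Now we want to find a $B$ such that the equality holds. We have the following SVDs:

$$A = Q_1 \begin{pmatrix} \operatorname{diag}(\sigma_1(A), \dots, \sigma_n(A)) & 0 \end{pmatrix} P_1^{-1}, \qquad \tilde D = Q_2 \begin{pmatrix} \operatorname{diag}(\sigma_1(\tilde D), \dots, \sigma_{M-n-p}(\tilde D)) \\ 0 \end{pmatrix} P_2^{-1}.$$

Let $T = (t_{ij})$ be an $(n+p) \times (M-n)$ matrix, where

$$t_{ij} = \begin{cases} \dfrac{\sigma_{p+1}(C)}{1 + \sigma_{p+1}(C)\|\tilde G L\|/\sigma_p(C)}, & \text{if } i = n \text{ and } j = M-n-p; \\[2ex] 0, & \text{otherwise.} \end{cases}$$

Let $\mathcal{B} = \{B \in \mathbb{R}^{n \times (M-n-p)} : \|B\| \le \epsilon\}$. Consider the map

$$f : \mathcal{B} \to \mathbb{R}^{n \times (M-n-p)}, \qquad B \mapsto A P_1 T Q_2^{-1} \tilde D.$$

Then $Q_1^{-1}(A P_1 T Q_2^{-1} \tilde D) P_2$ has only one nonzero entry,

$$\frac{\sigma_n(A)\,\sigma_{p+1}(C)\,\sigma_{M-n-p}(\tilde D)}{1 + \sigma_{p+1}(C)\|\tilde G L\|/\sigma_p(C)}.$$

Note that $\tilde D$ and $T$ are functions of $B$. By the first half of the proof,

$$\|A P_1 T Q_2^{-1} \tilde D\| = \frac{\sigma_n(A)\,\sigma_{p+1}(C)\,\sigma_{M-n-p}(\tilde D)}{1 + \sigma_{p+1}(C)\|\tilde G L\|/\sigma_p(C)} \le \|B\| \le \epsilon.$$

Hence, $f(\mathcal{B}) \subset \mathcal{B}$. Now we apply the Brouwer fixed point theorem to conclude that there exists $B_0 \in \mathcal{B}$ such that
$B_0 = f(B_0) = A P_1 T Q_2^{-1} \tilde D$. Therefore,

$$\|B_0\| = \frac{\sigma_n(A)\,\sigma_{p+1}(C)\,\sigma_{M-n-p}(\tilde D)}{1 + \sigma_{p+1}(C)\|\tilde G L\|/\sigma_p(C)}.$$

So far, we have shown that

$$\sup_{\|B\| \le \epsilon} \frac{\sigma_n(A)\,\sigma_{p+1}(C)\,\sigma_{M-n-p}(\tilde D)}{\|B\|\left(1 + \sigma_{p+1}(C)\|\tilde G L\|/\sigma_p(C)\right)} = 1. \tag{3.10}$$

Now we notice that the differences between the quantities with a tilde and those without are of order $O(\epsilon)$, and
$\sigma_{p+1}(C)$ is also of order $O(\epsilon)$. Therefore,

$$\sup_{\|B\| \le \epsilon} \frac{\sigma_{p+1}(C)}{\|B\|} = \frac{1}{\sigma_n(A)\,\sigma_{M-n-p}(D)} + O(\epsilon).$$

□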



This theorem gives the least upper bound of the error in terms of the smallest singular values of $A$ and $D$, up to
first order in $\epsilon$. Eq. (3.10) holds even for large $\epsilon$, though its complicated form makes it hardly useful.
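The first-order relation of Theorem 3.4 can be observed in the smallest nontrivial case. The sketch below assumes $M = 3$ and $n = p = 1$, so that $B$ is a single number $b$, $A$ is a $1 \times 2$ row, $D$ is a $2 \times 1$ column, and $C$ is $2 \times 2$; all singular values then have closed forms. The particular matrix and the function names are hypothetical choices for illustration. Since $B$ is a scalar here, the code evaluates the ratio $\sigma_{p+1}(C)/\|B\|$ at individual values of $b$ rather than computing a supremum.

```python
import math

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (floating point)."""
    m = len(a)
    aug = [[float(x) for x in row] + [float(i == j) for j in range(m)]
           for i, row in enumerate(a)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [x / d for x in aug[c]]
        for r in range(m):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[m:] for row in aug]

def sigma_min_2x2(c):
    """Smallest singular value of a 2x2 matrix, computed as |det| / sigma_1."""
    t = sum(x * x for row in c for x in row)            # sigma_1^2 + sigma_2^2
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]         # +/- sigma_1 * sigma_2
    sigma1 = math.sqrt((t + math.sqrt(max(t * t - 4.0 * det * det, 0.0))) / 2.0)
    return abs(det) / sigma1

def ratio(b):
    """sigma_2(C) / |b| for K = [[2, -1, b], [-1, 2, -1], [0, -1, 2]]."""
    K = [[2.0, -1.0, b], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
    Kinv = inverse(K)
    C = [row[1:] for row in Kinv[:2]]                   # rows 1..n+p, columns n+1..M
    return sigma_min_2x2(C) / abs(b)

def predicted():
    """1 / (sigma_n(A) * sigma_{M-n-p}(D)) with A = (2, -1) and D = (-1, 2)^T."""
    return 1.0 / (math.hypot(2.0, -1.0) * math.hypot(-1.0, 2.0))

for b in (1e-2, 1e-3, 1e-4):
    print(b, ratio(b), predicted())
```

For this example the predicted leading term is $1/5$, and the printed ratios approach it as $b$ decreases, in line with the $O(\epsilon)$ correction in the theorem.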
4. EXTENSION TO INFINITE DIMENSIONS



The continuous counterpart of the singular value estimate may be formulated in the following way. Let $H_0$ be an
invertible operator acting on functions of a single variable, whose Green's function has finite off-diagonal rank. To be
specific, the space of functions is taken to be $L^2([0, 1])$. The symbol $H$ is not meant to suggest that the operator is Hermitian (or
self-adjoint). Now suppose a perturbation $\epsilon W$ is turned on, where $\|W\| = 1$. In a typical problem, $H_0$ is a local
operator such as the Schrödinger operator $-\frac{d^2}{dx^2} + V(x)$ and $W$ is an integral operator. The equation satisfied by the
Green's function is

$$H_0 G(x, x') + \epsilon \int_0^1 W(x, x'') G(x'', x')\,dx'' = \delta(x - x'). \tag{4.1}$$

We can rewrite the above equation in Dirac notation:

$$H_0 |\psi\rangle + \epsilon W |\psi\rangle = |x'\rangle. \tag{4.2}$$

Here, $|\psi\rangle$ is some function and $\langle x|\psi\rangle = \psi(x)$. In particular, $\langle x|x'\rangle = \delta(x - x')$.

Before proceeding to estimate the error in the Green's function, we make some assumptions on the perturbation
$\epsilon W$. First, we require that $\epsilon < 1/\|H_0^{-1}\|$ in order to make the operator $H_0 + \epsilon W$ invertible. In the following derivation,
we need the inverse of $H_0$ to have finite off-diagonal rank. To understand this condition better, we write $H_0$ and $H_0^{-1}$
in block form:

$$H_0 = \begin{pmatrix} A & B \\ G & D \end{pmatrix} \quad\text{and}\quad H_0^{-1} = \begin{pmatrix} E & C \\ F & H \end{pmatrix},$$

where $A : L^2([0, x_0]) \to L^2([0, x_1])$ is the restriction of $H_0$ and $0 < x_1 < x_0 < 1$. We are concerned with the Green's
function $G(x, x')$ where $x < x_0$ and $x' > x_1$. We observe that $AC = 0$. Hence, $\operatorname{Im} C \subset \operatorname{Ker} A$. Now we assume that $A$
is a Fredholm operator, or more generally, a bounded operator with finite-dimensional kernel and closed image. Then
$C$ has finite rank. Moreover, since the index of a Fredholm operator is invariant under sufficiently small perturbations,
we can absorb the diagonal part of $W$ into $H_0$ without increasing the rank of $C$. Without loss of generality, we can
assume that $\langle x'| W |x''\rangle \ne 0$ only if $x' < x_1 < x_0 < x''$, or $x'' < x_0$ and $x' > x_1$. This is consistent with the assumption
we have made in the discrete case.

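Standard perturbation theory yields

$$\begin{aligned} G(x, x') &= \langle x|\psi\rangle \\ &= \langle x| \frac{1}{H_0 + \epsilon W} |x'\rangle \\ &= \langle x| \left(H_0^{-1} - \epsilon H_0^{-1} W H_0^{-1} + O(\epsilon^2)\right) |x'\rangle \\ &= G_0(x, x') - \epsilon \iint G_0(x, x'') \langle x''| W |x'''\rangle G_0(x''', x')\,dx''\,dx''' + O(\epsilon^2), \end{aligned}$$

where $G_0$ is the Green's function of the unperturbed operator $H_0$. We will denote $\langle x''| W |x'''\rangle$ by $W(x'', x''')$ from
now on.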

By the assumption on $W$, $W(x'', x''') \ne 0$ only if $x'' < x_1 < x_0 < x'''$, or $x''' < x_0$ and $x'' > x_1$. Hence, the integral
can be broken into two parts:

$$\iint_{[0,1] \times [0,1]} G_0(x, x'') W(x'', x''') G_0(x''', x')\,dx''\,dx''' = \int_0^{x_1} dx''\, G_0(x, x'') \int_{x_0}^{1} dx'''\, W(x'', x''') G_0(x''', x') + \int_{x_1}^{1} dx''\, G_0(x, x'') \int_0^{x_0} dx'''\, W(x'', x''') G_0(x''', x').$$

The second term on the right hand side does not increase the rank since $x < x_0$ and $x'' > x_1$. Furthermore, $\|W\| = 1$,
so the $L^2$ norm of the first term is bounded by $\sigma_1 \sigma_2$, where

$$\sigma_1 = \left(\int_0^{x_0} dy \int_0^{x_1} dy'\, |G_0(y, y')|^2\right)^{1/2}; \tag{4.3}$$

$$\sigma_2 = \left(\int_{x_0}^{1} dy \int_{x_1}^{1} dy'\, |G_0(y, y')|^2\right)^{1/2}. \tag{4.4}$$

They are the continuous counterparts of $\sigma_k(A)$ and $\sigma_{n-k+p}(D)$ in the discrete case. Combining all the above equations,
we conclude that if the off-diagonal rank of the Green's function of $H_0$ is $n$, then the $(n+1)$th singular value of the off-diagonal block of
the Green's function of $H_0 + \epsilon W$ is bounded by $\epsilon \sigma_1 \sigma_2 + O(\epsilon^2)$.



5. OPEN QUESTION 



The Nullity Theorem also applies when the unperturbed matrix has low off-diagonal rank but is not
banded. We have assumed throughout that $B$ in Theorem 2.1 equals zero. However, if $\operatorname{rank} B = k \ne 0$, we
still have a low-rank off-diagonal block: $\operatorname{rank} C = p + k$, by the Nullity Theorem. It is considerably harder, however,
to generalize this result to an approximate case. In particular, there is no known estimate for $\sigma_{p+k+1}(C)/\sigma_{k+1}(B)$.
Experiments indicate that the dependence of this ratio on $A$ and $D$ is very complicated and that it may also depend on
the first $k$ singular values of $B$. Therefore, our result is not a full generalization of the Nullity Theorem.



6. CONCLUSION 



In this paper, we have studied nearly banded matrices via the off-diagonal blocks of their inverses and obtained
an estimate (Eq. (3.10)) for the $(p+1)$th singular value of an off-diagonal block, which in some sense measures
the performance of the banded approximation. The result is then extended to a particular example of function spaces.

This research is sponsored by the Lord Fund at MIT. The research was initiated by Gilbert Strang, and the extension
to the continuous case was inspired by Steven Johnson, to whom the author is deeply grateful.